Investigating cross-lingual training for offensive language detection
Similar resources
Co-Training for Cross-Lingual Sentiment Classification
The lack of Chinese sentiment corpora limits the research progress on Chinese sentiment classification. However, there are many freely available English sentiment corpora on the Web. This paper focuses on the problem of cross-lingual sentiment classification, which leverages an available English corpus for Chinese sentiment classification by using the English corpus as training data. Machine tr...
Offensive Language Detection Using Multi-level Classification
Text messaging through the Internet or cellular phones has become a major medium of personal and commercial communication. At the same time, flames (such as rants, taunts, and squalid phrases) are offensive/abusive phrases which might attack or offend the users for a variety of reasons. An automatic discriminative software with a sensitivity parameter for flame or abusive language detection wou...
Language model adaptation using cross-lingual information
The success of statistical language modeling techniques is crucially dependent on the availability of a large amount of training text. For a language in which such large text collections are not available, methods have recently been proposed to take advantage of a resource-rich language, together with cross-lingual information retrieval and machine translation, to sharpen language models for the r...
Cross-Lingual News Group Recommendation Using Cluster-Based Cross-Training
Many Web news portals have provided clustered news categories for readers to browse many related news articles. However, to the best of our knowledge, they only provide monolingual services. For readers who want to find related news articles in different languages, the search process is very cumbersome. In this paper, we propose a cross-lingual news group recommendation framework using the cros...
Automated Hate Speech Detection and the Problem of Offensive Language
A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech and previous work using supervised learning has failed to distinguish between the two categories. We used a crowd-sourced...
Journal
Journal title: PeerJ Computer Science
Year: 2021
ISSN: 2376-5992
DOI: 10.7717/peerj-cs.559